Stopping Criterion for Boosting-Based Data Reduction Techniques: from Binary to Multiclass Problem
نویسندگان
چکیده
So far, boosting has been used to improve the quality of moderately accurate learning algorithms, by weighting and combining many of their weak hypotheses into a final classifier with theoretically high accuracy. In a recent work (Sebban, Nock and Lallich, 2001), we have attempted to adapt boosting properties to data reduction techniques. In this particular context, the objective was not only to improve the success rate, but also to reduce the time and space complexities due to the storage requirements of some costly learning algorithms, such as nearest-neighbor classifiers. In that framework, each weak hypothesis, which is usually built and weighted from the learning set, is replaced by a single learning instance. The weight given by boosting defines in that case the relevance of the instance, and a statistical test allows one to decide whether it can be discarded without damaging further classification tasks. In Sebban, Nock and Lallich (2001), we addressed problems with two classes. It is the aim of the present paper to relax the class constraint, and extend our contribution to multiclass problems. Beyond data reduction, experimental results are also provided on twenty-three datasets, showing the benefits that our boosting-derived weighting rule brings to weighted nearest neighbor classifiers.
منابع مشابه
Stopping Criterion for Boosting-Based Data Reduction Techniques: from Binary to Multiclass Problems
So far, boosting has been used to improve the quality of moderately accurate learning algorithms, by weighting and combining many of their weak hypotheses into a final classifier with theoretically high accuracy. In a recent work (Sebban, Nock and Lallich, 2001), we have attempted to adapt boosting properties to data reduction techniques. In this particular context, the objective was not only t...
متن کاملS Tudents ’ P Erformance P Rediction S Ystem Using M Ulti a Gent Data M Ining T Echnique
A high prediction accuracy of the students’ performance is more helpful to identify the low performance students at the beginning of the learning process. Data mining is used to attain this objective. Data mining techniques are used to discover models or patterns of data, and it is much helpful in the decision-making. Boosting technique is the most popular techniques for constructing ensembles ...
متن کاملTotally Corrective Multiclass Boosting with Binary Weak Learners
In this work, we propose a new optimization framework for multiclass boosting learning. In the literature, AdaBoost.MO and AdaBoost.ECC are the two successful multiclass boosting algorithms, which can use binary weak learners. We explicitly derive these two algorithms’ Lagrange dual problems based on their regularized loss functions. We show that the Lagrange dual formulations enable us to desi...
متن کاملLearning with Equivalence Constraints and the Relation to Multiclass Learning
We study the problem of learning partitions using equivalence constraints as input. This is a binary classification problem in the product space of pairs of datapoints. The training data includes pairs of datapoints which are labeled as coming from the same class or not. This kind of data appears naturally in applications where explicit labeling of datapoints is hard to get, but relations betwe...
متن کاملMulticlass Boosting with Hinge Loss based on Output Coding
Multiclass classification is an important and fundamental problem in machine learning. A popular family of multiclass classification methods belongs to reducing multiclass to binary based on output coding. Several multiclass boosting algorithms have been proposed to learn the coding matrix and the associated binary classifiers in a problemdependent way. These algorithms can be unified under a s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Machine Learning Research
دوره 3 شماره
صفحات -
تاریخ انتشار 2002